Overview
Dataset statistics
| Number of variables | 8 |
|---|---|
| Number of observations | 1000 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 62.6 KiB |
| Average record size in memory | 64.1 B |
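The average record size follows from the two rows above it. The short check below is only a sketch: it assumes the report divides the total in-memory size by the number of observations and that 1 KiB is 1024 bytes. The variable names are illustrative.

```python
# Figures copied from the "Dataset statistics" table.
total_size_kib = 62.6
n_observations = 1000

# Assumption: 1 KiB = 1024 bytes, and the average is total bytes / rows.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B")
```

The result rounds to the 64.1 B shown in the table.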
Variable types
| Text | 1 |
|---|---|
| Numeric | 6 |
| Categorical | 1 |
Alerts
| Credit_Score is highly overall correlated with Label_LoanDefault | High correlation |
|---|---|
| Label_LoanDefault is highly overall correlated with Credit_Score | High correlation |
| CustomerID has unique values | Unique |
Reproduction
| Analysis started | 2025-10-12 00:15:33.677835 |
|---|---|
| Analysis finished | 2025-10-12 00:15:36.578344 |
| Duration | 2.9 seconds |
| Software version | ydata-profiling v4.17.0 |
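The reported duration can be checked from the two timestamps. This is a minimal sketch using only the standard library; it assumes both timestamps are in the same time zone.

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table.
started = datetime.fromisoformat("2025-10-12 00:15:33.677835")
finished = datetime.fromisoformat("2025-10-12 00:15:36.578344")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.1f} seconds")
```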
Variables
CustomerID
Text
Unique
| Distinct | 1000 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.9 KiB |
Length
| Max length | 9 |
|---|---|
| Median length | 9 |
| Mean length | 9 |
| Min length | 9 |
Unique
| Unique | 1000 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | CUST00001 |
|---|---|
| 2nd row | CUST00002 |
| 3rd row | CUST00003 |
| 4th row | CUST00004 |
| 5th row | CUST00005 |
| Value | Count | Frequency (%) |
| cust00009 | 1 | 0.1% |
| cust01000 | 1 | 0.1% |
| cust00001 | 1 | 0.1% |
| cust00002 | 1 | 0.1% |
| cust00003 | 1 | 0.1% |
| cust00004 | 1 | 0.1% |
| cust00005 | 1 | 0.1% |
| cust00006 | 1 | 0.1% |
| cust00985 | 1 | 0.1% |
| cust00986 | 1 | 0.1% |
| Other values (990) | 990 | 99.0% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 2299 | 25.5% |
| C | 1000 | 11.1% |
| U | 1000 | 11.1% |
| S | 1000 | 11.1% |
| T | 1000 | 11.1% |
| 1 | 301 | 3.3% |
| 2 | 300 | 3.3% |
| 3 | 300 | 3.3% |
| 4 | 300 | 3.3% |
| 5 | 300 | 3.3% |
| Other values (4) | 1200 | 13.3% |
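The character counts above can be reproduced with a few lines of Python. The sketch assumes the IDs run sequentially from CUST00001 to CUST01000 with no gaps, which is consistent with the sample rows and the 1000 distinct nine-character values but is not stated outright in the report.

```python
from collections import Counter

# Assumption: IDs are CUST00001 .. CUST01000, one per row, no gaps.
ids = [f"CUST{i:05d}" for i in range(1, 1001)]

# Count every character across all IDs.
char_counts = Counter("".join(ids))
total_chars = sum(char_counts.values())

# Print the most frequent characters with their share of all characters.
for ch, count in char_counts.most_common(6):
    print(ch, count, f"{100 * count / total_chars:.1f}%")
```

Under that assumption the counts match the table: 9000 characters in total, 2299 zeros, 1000 of each letter, and 301 ones (the extra one comes from CUST01000).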
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 9000 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 9000 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 9000 |
Recency_Days
Real number (ℝ)
| Distinct | 342 |
|---|---|
| Distinct (%) | 34.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 183.588 |
| Minimum | 2 |
|---|---|
| Maximum | 364 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.9 KiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 20 |
| Q1 | 91 |
| median | 185 |
| Q3 | 276.25 |
| 95-th percentile | 346.05 |
| Maximum | 364 |
| Range | 362 |
| Interquartile range (IQR) | 185.25 |
Descriptive statistics
| Standard deviation | 105.05834 |
|---|---|
| Coefficient of variation (CV) | 0.5722506 |
| Kurtosis | -1.2085521 |
| Mean | 183.588 |
| Median Absolute Deviation (MAD) | 93 |
| Skewness | -0.030661018 |
| Sum | 183588 |
| Variance | 11037.256 |
| Monotonicity | Not monotonic |
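Several of the figures above are derived from one another. The sketch below recomputes them for Recency_Days from the reported values, assuming the usual definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1, range = maximum - minimum). Small differences in the last digit are expected because the inputs are already rounded.

```python
# Values copied from the Recency_Days tables.
mean, std = 183.588, 105.05834
q1, q3 = 91, 276.25
minimum, maximum = 2, 364
total, n = 183588, 1000

cv = std / mean                   # coefficient of variation
variance = std ** 2               # variance from the standard deviation
iqr = q3 - q1                     # interquartile range
value_range = maximum - minimum   # range
mean_from_sum = total / n         # mean from sum and row count

print(f"CV={cv:.7f} variance={variance:.3f} IQR={iqr} "
      f"range={value_range} mean={mean_from_sum}")
```

The same relationships can be checked against the tables of the other numeric variables.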
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 269 | 8 | 0.8% |
| 35 | 8 | 0.8% |
| 158 | 8 | 0.8% |
| 163 | 7 | 0.7% |
| 350 | 7 | 0.7% |
| 232 | 7 | 0.7% |
| 285 | 7 | 0.7% |
| 44 | 6 | 0.6% |
| 341 | 6 | 0.6% |
| 169 | 6 | 0.6% |
| Other values (332) | 930 | 93.0% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 5 | 0.5% |
| 3 | 4 | 0.4% |
| 4 | 1 | 0.1% |
| 5 | 2 | 0.2% |
| 6 | 3 | 0.3% |
| 7 | 2 | 0.2% |
| 8 | 3 | 0.3% |
| 9 | 4 | 0.4% |
| 10 | 2 | 0.2% |
| 11 | 1 | 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 364 | 2 | 0.2% |
| 363 | 4 | 0.4% |
| 362 | 2 | 0.2% |
| 361 | 2 | 0.2% |
| 360 | 1 | 0.1% |
| 359 | 1 | 0.1% |
| 358 | 1 | 0.1% |
| 357 | 4 | 0.4% |
| 356 | 1 | 0.1% |
| 355 | 3 | 0.3% |
Frequency
Real number (ℝ)
| Distinct | 49 |
|---|---|
| Distinct (%) | 4.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 24.851 |
| Minimum | 1 |
|---|---|
| Maximum | 49 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.9 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 13 |
| median | 25 |
| Q3 | 37 |
| 95-th percentile | 47 |
| Maximum | 49 |
| Range | 48 |
| Interquartile range (IQR) | 24 |
Descriptive statistics
| Standard deviation | 14.288841 |
|---|---|
| Coefficient of variation (CV) | 0.57498051 |
| Kurtosis | -1.207634 |
| Mean | 24.851 |
| Median Absolute Deviation (MAD) | 12 |
| Skewness | 0.025301161 |
| Sum | 24851 |
| Variance | 204.17097 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=49)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 42 | 32 | 3.2% |
| 3 | 29 | 2.9% |
| 17 | 27 | 2.7% |
| 49 | 26 | 2.6% |
| 37 | 25 | 2.5% |
| 1 | 25 | 2.5% |
| 12 | 25 | 2.5% |
| 6 | 24 | 2.4% |
| 26 | 24 | 2.4% |
| 21 | 24 | 2.4% |
| Other values (39) | 739 | 73.9% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 25 | 2.5% |
| 2 | 16 | 1.6% |
| 3 | 29 | 2.9% |
| 4 | 16 | 1.6% |
| 5 | 18 | 1.8% |
| 6 | 24 | 2.4% |
| 7 | 20 | 2.0% |
| 8 | 23 | 2.3% |
| 9 | 19 | 1.9% |
| 10 | 19 | 1.9% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 49 | 26 | 2.6% |
| 48 | 19 | 1.9% |
| 47 | 17 | 1.7% |
| 46 | 22 | 2.2% |
| 45 | 17 | 1.7% |
| 44 | 23 | 2.3% |
| 43 | 20 | 2.0% |
| 42 | 32 | 3.2% |
| 41 | 18 | 1.8% |
| 40 | 14 | 1.4% |
Monetary_Value
Real number (ℝ)
| Distinct | 949 |
|---|---|
| Distinct (%) | 94.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5084.863 |
| Minimum | 105 |
|---|---|
| Maximum | 9994 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.9 KiB |
Quantile statistics
| Minimum | 105 |
|---|---|
| 5-th percentile | 713.55 |
| Q1 | 2642.25 |
| median | 4968 |
| Q3 | 7572.25 |
| 95-th percentile | 9561.4 |
| Maximum | 9994 |
| Range | 9889 |
| Interquartile range (IQR) | 4930 |
Descriptive statistics
| Standard deviation | 2860.7018 |
|---|---|
| Coefficient of variation (CV) | 0.56259171 |
| Kurtosis | -1.2056201 |
| Mean | 5084.863 |
| Median Absolute Deviation (MAD) | 2518.5 |
| Skewness | 0.038142394 |
| Sum | 5084863 |
| Variance | 8183614.6 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 4671 | 3 | 0.3% |
| 8248 | 3 | 0.3% |
| 8317 | 2 | 0.2% |
| 9165 | 2 | 0.2% |
| 4599 | 2 | 0.2% |
| 4035 | 2 | 0.2% |
| 2967 | 2 | 0.2% |
| 1804 | 2 | 0.2% |
| 3272 | 2 | 0.2% |
| 8724 | 2 | 0.2% |
| Other values (939) | 978 | 97.8% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 105 | 1 | 0.1% |
| 110 | 1 | 0.1% |
| 112 | 1 | 0.1% |
| 145 | 1 | 0.1% |
| 146 | 1 | 0.1% |
| 154 | 1 | 0.1% |
| 163 | 1 | 0.1% |
| 174 | 1 | 0.1% |
| 193 | 1 | 0.1% |
| 271 | 1 | 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 9994 | 1 | 0.1% |
| 9986 | 1 | 0.1% |
| 9967 | 2 | 0.2% |
| 9948 | 1 | 0.1% |
| 9936 | 1 | 0.1% |
| 9924 | 1 | 0.1% |
| 9921 | 1 | 0.1% |
| 9918 | 1 | 0.1% |
| 9908 | 1 | 0.1% |
| 9904 | 1 | 0.1% |
Loan_Amount
Real number (ℝ)
| Distinct | 993 |
|---|---|
| Distinct (%) | 99.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 25305.592 |
| Minimum | 528 |
|---|---|
| Maximum | 49961 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.9 KiB |
Quantile statistics
| Minimum | 528 |
|---|---|
| 5-th percentile | 3704.95 |
| Q1 | 12462.25 |
| median | 25235 |
| Q3 | 37586 |
| 95-th percentile | 47830.1 |
| Maximum | 49961 |
| Range | 49433 |
| Interquartile range (IQR) | 25123.75 |
Descriptive statistics
| Standard deviation | 14157.721 |
|---|---|
| Coefficient of variation (CV) | 0.55947008 |
| Kurtosis | -1.2027302 |
| Mean | 25305.592 |
| Median Absolute Deviation (MAD) | 12499.5 |
| Skewness | 0.026866855 |
| Sum | 25305592 |
| Variance | 2.0044108 × 10⁸ |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 46970 | 2 | 0.2% |
| 45934 | 2 | 0.2% |
| 37688 | 2 | 0.2% |
| 49942 | 2 | 0.2% |
| 20442 | 2 | 0.2% |
| 5688 | 2 | 0.2% |
| 32378 | 2 | 0.2% |
| 46531 | 1 | 0.1% |
| 48049 | 1 | 0.1% |
| 35392 | 1 | 0.1% |
| Other values (983) | 983 | 98.3% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 528 | 1 | 0.1% |
| 655 | 1 | 0.1% |
| 713 | 1 | 0.1% |
| 764 | 1 | 0.1% |
| 894 | 1 | 0.1% |
| 936 | 1 | 0.1% |
| 1000 | 1 | 0.1% |
| 1045 | 1 | 0.1% |
| 1096 | 1 | 0.1% |
| 1214 | 1 | 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 49961 | 1 | 0.1% |
| 49942 | 2 | 0.2% |
| 49934 | 1 | 0.1% |
| 49806 | 1 | 0.1% |
| 49743 | 1 | 0.1% |
| 49730 | 1 | 0.1% |
| 49706 | 1 | 0.1% |
| 49679 | 1 | 0.1% |
| 49653 | 1 | 0.1% |
| 49518 | 1 | 0.1% |
Credit_Score
Real number (ℝ)
High correlation
| Distinct | 462 |
|---|---|
| Distinct (%) | 46.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 568.325 |
| Minimum | 300 |
|---|---|
| Maximum | 849 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.9 KiB |
Quantile statistics
| Minimum | 300 |
|---|---|
| 5-th percentile | 324 |
| Q1 | 438.75 |
| median | 562 |
| Q3 | 713 |
| 95-th percentile | 821.05 |
| Maximum | 849 |
| Range | 549 |
| Interquartile range (IQR) | 274.25 |
Descriptive statistics
| Standard deviation | 160.75334 |
|---|---|
| Coefficient of variation (CV) | 0.28285459 |
| Kurtosis | -1.2120624 |
| Mean | 568.325 |
| Median Absolute Deviation (MAD) | 140 |
| Skewness | 0.069738381 |
| Sum | 568325 |
| Variance | 25841.635 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 720 | 8 | 0.8% |
| 404 | 7 | 0.7% |
| 716 | 6 | 0.6% |
| 444 | 6 | 0.6% |
| 760 | 5 | 0.5% |
| 811 | 5 | 0.5% |
| 337 | 5 | 0.5% |
| 577 | 5 | 0.5% |
| 842 | 5 | 0.5% |
| 351 | 5 | 0.5% |
| Other values (452) | 943 | 94.3% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 300 | 1 | 0.1% |
| 301 | 2 | 0.2% |
| 302 | 1 | 0.1% |
| 303 | 2 | 0.2% |
| 304 | 4 | 0.4% |
| 306 | 2 | 0.2% |
| 307 | 2 | 0.2% |
| 308 | 2 | 0.2% |
| 309 | 1 | 0.1% |
| 310 | 2 | 0.2% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 849 | 1 | 0.1% |
| 848 | 2 | 0.2% |
| 847 | 1 | 0.1% |
| 846 | 2 | 0.2% |
| 845 | 2 | 0.2% |
| 844 | 4 | 0.4% |
| 843 | 2 | 0.2% |
| 842 | 5 | 0.5% |
| 841 | 1 | 0.1% |
| 840 | 1 | 0.1% |
Age
Real number (ℝ)
| Distinct | 62 |
|---|---|
| Distinct (%) | 6.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 48.514 |
| Minimum | 18 |
|---|---|
| Maximum | 79 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.9 KiB |
Quantile statistics
| Minimum | 18 |
|---|---|
| 5-th percentile | 21 |
| Q1 | 33 |
| median | 49 |
| Q3 | 64 |
| 95-th percentile | 76 |
| Maximum | 79 |
| Range | 61 |
| Interquartile range (IQR) | 31 |
Descriptive statistics
| Standard deviation | 17.573236 |
|---|---|
| Coefficient of variation (CV) | 0.3622302 |
| Kurtosis | -1.1749324 |
| Mean | 48.514 |
| Median Absolute Deviation (MAD) | 15.5 |
| Skewness | 0.0032558033 |
| Sum | 48514 |
| Variance | 308.81862 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 33 | 27 | 2.7% |
| 52 | 24 | 2.4% |
| 50 | 24 | 2.4% |
| 30 | 23 | 2.3% |
| 69 | 22 | 2.2% |
| 51 | 21 | 2.1% |
| 75 | 21 | 2.1% |
| 27 | 20 | 2.0% |
| 72 | 20 | 2.0% |
| 58 | 19 | 1.9% |
| Other values (52) | 779 | 77.9% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 18 | 13 | 1.3% |
| 19 | 17 | 1.7% |
| 20 | 12 | 1.2% |
| 21 | 17 | 1.7% |
| 22 | 11 | 1.1% |
| 23 | 16 | 1.6% |
| 24 | 14 | 1.4% |
| 25 | 18 | 1.8% |
| 26 | 13 | 1.3% |
| 27 | 20 | 2.0% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 79 | 19 | 1.9% |
| 78 | 10 | 1.0% |
| 77 | 9 | 0.9% |
| 76 | 16 | 1.6% |
| 75 | 21 | 2.1% |
| 74 | 18 | 1.8% |
| 73 | 12 | 1.2% |
| 72 | 20 | 2.0% |
| 71 | 12 | 1.2% |
| 70 | 14 | 1.4% |
Label_LoanDefault
Categorical
High correlation
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.9 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 1 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 519 | 51.9% |
| 0 | 481 | 48.1% |
Length
Histogram of lengths of the category
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 519 | 51.9% |
| 0 | 481 | 48.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 1000 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 1000 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 1000 |
Interactions
Correlations
| | Age | Credit_Score | Frequency | Label_LoanDefault | Loan_Amount | Monetary_Value | Recency_Days |
|---|---|---|---|---|---|---|---|
| Age | 1.000 | 0.026 | -0.019 | 0.035 | -0.014 | -0.045 | 0.010 |
| Credit_Score | 0.026 | 1.000 | 0.023 | 0.589 | -0.029 | -0.034 | -0.008 |
| Frequency | -0.019 | 0.023 | 1.000 | 0.000 | 0.056 | 0.008 | -0.053 |
| Label_LoanDefault | 0.035 | 0.589 | 0.000 | 1.000 | 0.065 | 0.048 | 0.000 |
| Loan_Amount | -0.014 | -0.029 | 0.056 | 0.065 | 1.000 | -0.022 | -0.002 |
| Monetary_Value | -0.045 | -0.034 | 0.008 | 0.048 | -0.022 | 1.000 | -0.013 |
| Recency_Days | 0.010 | -0.008 | -0.053 | 0.000 | -0.002 | -0.013 | 1.000 |
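The two "High correlation" alerts can be traced back to this matrix. The sketch below scans the off-diagonal cells for values at or above a threshold. The threshold of 0.5 is an assumption made for illustration, not a setting confirmed by this report, and the variable names are hypothetical.

```python
# Correlation matrix copied from the table above (same row and column order).
labels = ["Age", "Credit_Score", "Frequency", "Label_LoanDefault",
          "Loan_Amount", "Monetary_Value", "Recency_Days"]
matrix = [
    [1.000, 0.026, -0.019, 0.035, -0.014, -0.045, 0.010],
    [0.026, 1.000, 0.023, 0.589, -0.029, -0.034, -0.008],
    [-0.019, 0.023, 1.000, 0.000, 0.056, 0.008, -0.053],
    [0.035, 0.589, 0.000, 1.000, 0.065, 0.048, 0.000],
    [-0.014, -0.029, 0.056, 0.065, 1.000, -0.022, -0.002],
    [-0.045, -0.034, 0.008, 0.048, -0.022, 1.000, -0.013],
    [0.010, -0.008, -0.053, 0.000, -0.002, -0.013, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off for a "high" correlation

# Collect every off-diagonal pair whose absolute correlation meets the cut-off.
flagged = [
    (labels[i], labels[j], matrix[i][j])
    for i in range(len(labels))
    for j in range(len(labels))
    if i != j and abs(matrix[i][j]) >= THRESHOLD
]

for row_name, col_name, value in flagged:
    print(f"{row_name} is highly correlated with {col_name} ({value:.3f})")
```

With this cut-off only the Credit_Score and Label_LoanDefault pair is flagged, once in each direction, which matches the two alerts in the overview. Every other coefficient in the matrix is below 0.07 in absolute value.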
Missing values
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
Sample
First rows
| | CustomerID | Recency_Days | Frequency | Monetary_Value | Loan_Amount | Credit_Score | Age | Label_LoanDefault |
|---|---|---|---|---|---|---|---|---|
| 0 | CUST00001 | 33 | 40 | 8793 | 39012 | 474 | 30 | 1 |
| 1 | CUST00002 | 282 | 37 | 714 | 33756 | 763 | 20 | 0 |
| 2 | CUST00003 | 239 | 28 | 4035 | 46531 | 319 | 56 | 0 |
| 3 | CUST00004 | 160 | 44 | 4636 | 48049 | 545 | 60 | 1 |
| 4 | CUST00005 | 158 | 7 | 7999 | 35392 | 740 | 65 | 0 |
| 5 | CUST00006 | 313 | 46 | 1377 | 18860 | 824 | 76 | 0 |
| 6 | CUST00007 | 32 | 42 | 4397 | 30153 | 438 | 48 | 1 |
| 7 | CUST00008 | 254 | 25 | 1608 | 21541 | 657 | 69 | 1 |
| 8 | CUST00009 | 74 | 36 | 7897 | 31640 | 404 | 62 | 1 |
| 9 | CUST00010 | 35 | 26 | 6359 | 40700 | 363 | 35 | 1 |
Last rows
| | CustomerID | Recency_Days | Frequency | Monetary_Value | Loan_Amount | Credit_Score | Age | Label_LoanDefault |
|---|---|---|---|---|---|---|---|---|
| 990 | CUST00991 | 125 | 27 | 4614 | 37846 | 656 | 21 | 1 |
| 991 | CUST00992 | 200 | 22 | 6415 | 37811 | 587 | 47 | 0 |
| 992 | CUST00993 | 72 | 15 | 2011 | 7442 | 845 | 69 | 0 |
| 993 | CUST00994 | 158 | 32 | 8576 | 14613 | 603 | 58 | 1 |
| 994 | CUST00995 | 193 | 37 | 7728 | 25817 | 653 | 26 | 0 |
| 995 | CUST00996 | 228 | 1 | 6347 | 17084 | 416 | 31 | 1 |
| 996 | CUST00997 | 192 | 28 | 3086 | 1764 | 710 | 70 | 1 |
| 997 | CUST00998 | 132 | 49 | 3541 | 42362 | 750 | 78 | 0 |
| 998 | CUST00999 | 351 | 14 | 4406 | 36698 | 673 | 64 | 0 |
| 999 | CUST01000 | 187 | 14 | 6659 | 26243 | 574 | 24 | 0 |